Method and apparatus for managing audio signals

ABSTRACT

A method comprising: detecting a first acoustic signal by using a microphone array; detecting a first angle associated with a first incident direction of the first acoustic signal; and storing, in a memory, a representation of the first acoustic signal and a representation of the first angle.

CROSS REFERENCE TO RELATED APPLICATIONS

This application is a Continuation of U.S. patent application Ser. No. 15/045,591, filed on Feb. 17, 2016 and assigned U.S. Pat. No. 9,601,132, issued on Mar. 21, 2017, which claims the benefit of the earlier U.S. patent application Ser. No. 14/840,336, filed on Aug. 31, 2015, which claims priority from and the benefit under 35 U.S.C. § 119(a) of Korean Patent Application No. 10-2014-0115394, filed on Sep. 1, 2014, which is hereby incorporated by reference for all purposes as if fully set forth herein.

BACKGROUND

1. Field of the Disclosure

The present disclosure relates to electronic devices, and more particularly to a method and apparatus for managing audio signals.

2. Description of the Prior Art

Recently, electronic devices have provided users with a function to record another party's voice, whether in everyday use or during a phone call, in addition to basic functions such as telephony or message transmission.

The electronic device includes a microphone for voice recording. The electronic device may include a plurality of microphones in order to record audio signals more faithfully. The plurality of microphones recognizes the direction of a speaker and forms beams in that direction to thereby faithfully record a voice coming from the direction of the speaker. The beams may be formed by applying weight values to the microphones in order to increase the amplitude of the audio signal.

SUMMARY

According to one aspect of the disclosure, a method is provided comprising: detecting a first acoustic signal by using a microphone array; detecting a first angle associated with a first incident direction of the first acoustic signal; and storing, in a memory, a representation of the first acoustic signal and a representation of the first angle.

According to another aspect of the disclosure, an electronic device is provided comprising: a microphone array; a memory; a speaker; and at least one processor configured to: detect a first acoustic signal by using the microphone array; detect a first angle associated with a first incident direction of the first acoustic signal; and store, in the memory, a representation of the first acoustic signal and a representation of the first angle.

BRIEF DESCRIPTION OF THE DRAWINGS

The above features and advantages of the present disclosure will be more apparent from the following detailed description in conjunction with the accompanying drawings, in which:

FIG. 1 is a block diagram of an example of an electronic device, according to various embodiments of the present disclosure;

FIG. 2 is a flowchart of an example of a process, according to embodiments of the present disclosure;

FIG. 3 is a flowchart of an example of a process, according to various embodiments of the present disclosure;

FIG. 4 is a diagram of an example of a system implementing the process of FIG. 3, according to various embodiments of the present disclosure;

FIG. 5 is a diagram of an example of a stored audio signal, according to various embodiments of the present disclosure;

FIG. 6 is a diagram of an example of a system for rendering audio, according to various embodiments of the present disclosure;

FIG. 7 is a diagram illustrating an example of a rendered audio signal, according to various embodiments of the present disclosure;

FIG. 8 is a flowchart of an example of a process, according to various embodiments of the present disclosure;

FIG. 9 is a diagram of an example of a system implementing the process of FIG. 8, according to various embodiments of the present disclosure;

FIG. 10 is a diagram of an example of a stored audio signal, according to various embodiments of the present disclosure;

FIG. 11 is a diagram of a system for rendering recorded audio signals, according to various embodiments of the present disclosure;

FIG. 12 is a flowchart of an example of a process, according to various embodiments of the present disclosure;

FIG. 13 is a diagram of an example of a system implementing the process of FIG. 12, according to various embodiments of the present disclosure;

FIG. 14 is a diagram of a stored audio signal, according to various embodiments of the present disclosure;

FIG. 15 is a diagram of an example of a system for rendering a stored audio signal, according to various embodiments of the present disclosure;

FIG. 16 is a diagram illustrating an example of a process for recording audio, according to various embodiments of the present disclosure;

FIG. 17 is a diagram of an example of a user interface for rendering audio, according to various embodiments of the present disclosure;

FIG. 18 is a diagram of an example of a user interface for rendering audio, according to various embodiments of the present disclosure;

FIG. 19 is a diagram of an example of a user interface for rendering audio, according to various embodiments of the present disclosure;

FIG. 20 is a diagram illustrating an example of a process for recording audio, according to various embodiments of the present disclosure; and

FIG. 21 is a diagram of an example of a user interface for rendering audio, according to various embodiments of the present disclosure.

DETAILED DESCRIPTION

Hereinafter, embodiments of the present disclosure will be described in detail with reference to the accompanying drawings. It will be easily appreciated by those skilled in the art that various modifications, additions and substitutions are possible in the embodiments disclosed herein, and that the scope of the disclosure should not be limited to the following embodiments. The embodiments of the present disclosure are provided such that those skilled in the art completely understand the disclosure. In the drawings, the same or similar elements are denoted by the same reference numerals even though they are depicted in different drawings.

The expressions such as “include” and “may include” which may be used in the present disclosure denote the presence of the disclosed functions, operations, and constituent elements, and do not limit one or more additional functions, operations, and constituent elements. In the present disclosure, the terms such as “include” and/or “have” may be construed to denote a certain characteristic, number, step, operation, constituent element, component or a combination thereof, but may not be construed to exclude the existence of or a possibility of the addition of one or more other characteristics, numbers, steps, operations, constituent elements, components or combinations thereof.

In the present disclosure, the expression “and/or” includes any and all combinations of the associated listed words. For example, the expression “A and/or B” may include A, may include B, or may include both A and B.

In the present disclosure, expressions including ordinal numbers, such as “first” and “second,” etc., may modify various elements. However, such elements are not limited by the above expressions. For example, the above expressions do not limit the sequence and/or importance of the elements. The above expressions are used merely for the purpose of distinguishing an element from the other elements. For example, a first user device and a second user device indicate different user devices, although both of them are user devices. For example, a first element could be termed a second element, and similarly, a second element could also be termed a first element without departing from the scope of the present disclosure.

When a component is referred to as being “connected to” or “accessed by” another component, it should be understood that the component may be directly connected to or accessed by the other component, but that another component may also exist between them. Meanwhile, when a component is referred to as being “directly connected to” or “directly accessed by” another component, it should be understood that there is no component therebetween.

The terms used in the present disclosure are only used to describe specific various embodiments, and are not intended to limit the present disclosure. Singular forms are intended to include plural forms unless the context clearly indicates otherwise.

Unless otherwise defined, all terms including technical and/or scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which the disclosure pertains. In addition, unless otherwise defined, all terms defined in generally used dictionaries may not be overly interpreted.

For example, the electronic device may correspond to a combination of at least one of the following: a smartphone, a tablet Personal Computer (PC), a mobile phone, a video phone, an e-book reader, a desktop PC, a laptop PC, a netbook computer, a Personal Digital Assistant (PDA), a Portable Multimedia Player (PMP), a digital audio player (e.g., an MP3 player), a mobile medical device, a camera, or a wearable device. Examples of the wearable device are a head-mounted device (HMD) (e.g., electronic eyeglasses), electronic clothing, an electronic bracelet, an electronic necklace, an appcessory, an electronic tattoo, a smart watch, etc.

The electronic device according to the embodiments of the present disclosure may be a smart home appliance. Examples of the smart home appliances are a television (TV), a Digital Video Disk (DVD) player, an audio system, a refrigerator, an air-conditioner, a cleaning device, an oven, a microwave oven, a washing machine, an air cleaner, a set-top box, a TV box (e.g., Samsung HomeSync™, Apple TV™, or Google TV™), a game console, an electronic dictionary, an electronic key, a camcorder, an electronic album, or the like.

The electronic device according to the embodiments of the present disclosure may include at least one of the following: medical devices (e.g., Magnetic Resonance Angiography (MRA), Magnetic Resonance Imaging (MRI), Computed Tomography (CT), a scanning machine, an ultrasonic scanning device, etc.), a navigation device, a Global Positioning System (GPS) receiver, an Event Data Recorder (EDR), a Flight Data Recorder (FDR), a vehicle infotainment device, electronic equipment for ships (e.g., navigation equipment, a gyrocompass, etc.), avionics, a security device, a head unit for vehicles, an industrial or home robot, an automated teller machine (ATM), a point of sales (POS) system, etc.

The electronic device according to the embodiments of the present disclosure may include at least one of the following: furniture or a portion of a building/structure, an electronic board, an electronic signature receiving device, a projector, or various measuring instruments (e.g., a water meter, an electric meter, a gas meter, and a wave meter). The electronic device according to the embodiments of the present disclosure may also include a combination of the devices listed above. In addition, the electronic device according to the embodiments of the present disclosure may be a flexible device. It is obvious to those skilled in the art that the electronic device according to the embodiments of the present disclosure is not limited to the aforementioned devices.

Hereinafter, electronic devices according to the embodiments of the present disclosure are described in detail with reference to the accompanying drawings. In the description, the term ‘user’ may refer to a person or a device that uses an electronic device, e.g., an artificial intelligence electronic device.

FIG. 1 is a block diagram of an example of an electronic device, according to various embodiments of the present disclosure. Referring to FIG. 1, the electronic device 100 may include a controller 110, a microphone unit 130, a speaker 140, a memory 160, and a communication unit 180. The controller 110 may control overall operations of the electronic device 100 and the signal traffic between internal elements of the electronic device 100, and may perform a data processing function. For example, the controller 110 may be formed of a central processing unit (CPU), or an application processor (AP). In addition, the controller 110 may be formed of a single-core processor, or a multi-core processor.

The controller 110 may include at least one processor. Each of the processors may include any combination of: one or more general-purpose processors (e.g., ARM-based processors, multi-core processors, etc.), a Field-Programmable Gate Array (FPGA), an Application-Specific Integrated Circuit (ASIC), a Digital Signal Processor (DSP), a Programmable Logic Device (PLD), and/or any other suitable type of processing circuitry. Additionally or alternatively, the controller 110 may include a speaker position detecting unit 111, a beamformer 113, a pulse-code-modulation (PCM) file creating unit 117, a coder 121, a decoder 123, and a user angle selecting unit 127.

The speaker position detecting unit 111 may find the direction of the audio signal that has the highest level of energy from among the audio signals received from the plurality of microphones 130. Here, the direction may be expressed as angle information. The speaker position detecting unit 111 may recognize the direction from which the speaker is currently speaking, using energy information, phase information, or correlation information between the microphones. When a plurality of speakers speak simultaneously, the speaker position detecting unit 111 may recognize the angle information in order of the energy intensity of the audio signals created by the speakers.
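By way of illustration only, the direction finding described above can be sketched for a single microphone pair using cross-correlation to estimate the time difference of arrival (TDOA). This is a minimal sketch, not the disclosed implementation; the sample rate, microphone spacing, and function name are assumptions.

    import numpy as np

    SAMPLE_RATE = 48_000      # Hz (assumed)
    MIC_SPACING = 0.05        # meters between the two microphones (assumed)
    SPEED_OF_SOUND = 343.0    # m/s

    def estimate_angle(mic_a: np.ndarray, mic_b: np.ndarray) -> float:
        """Estimate the incident angle (in degrees) of the dominant source
        from the time difference of arrival between two microphones."""
        # The peak of the cross-correlation gives the delay in samples.
        corr = np.correlate(mic_a, mic_b, mode="full")
        delay = (np.argmax(corr) - (len(mic_b) - 1)) / SAMPLE_RATE
        # Convert the delay into an angle relative to the array broadside.
        sine = np.clip(delay * SPEED_OF_SOUND / MIC_SPACING, -1.0, 1.0)
        return float(np.degrees(np.arcsin(sine)))

A full array would combine such pairwise estimates, or the energy and phase information stated above, to obtain one angle per speaker.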

The beamformer 113 may apply weight values to the microphones to increase the amplitude of the audio signal, so that beams may be formed which are able to spatially reduce the related noise when the direction of the audio signal and the direction of the noise differ from each other.

With regard to the formation of the beams, a sound wave created at the sound source travels a different distance to each microphone. Since the sound wave has a finite speed, it will reach each microphone at a different time instant. However, apart from the time difference, the sound waves created from the same sound source may be recognized as the same wave at each microphone. Therefore, if the position of the sound source is given, the arrival-time difference of the sound wave at each microphone may be calculated and corrected for, thereby making the waves match each other.
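The alignment just described is the essence of delay-and-sum beamforming. The sketch below illustrates it under simplifying assumptions (integer sample delays, wrap-around shifting); it is not the disclosed beamformer 113 itself.

    import numpy as np

    def delay_and_sum(channels: list[np.ndarray],
                      delays: list[int],
                      weights: list[float] | None = None) -> np.ndarray:
        """Shift each microphone channel by its arrival-time difference
        (in samples) and combine, reinforcing the target direction."""
        weights = weights or [1.0 / len(channels)] * len(channels)
        out = np.zeros(len(channels[0]), dtype=np.float64)
        for ch, d, w in zip(channels, delays, weights):
            # np.roll wraps at the edges; a real implementation would pad.
            out += w * np.roll(ch.astype(np.float64), -d)
        return out

Waves arriving from the steered direction add constructively after the shifts, while sound from other directions is attenuated.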

The PCM file creating unit 117 may convert the audio signals input from the plurality of microphones 130 into PCM files. Here, a PCM file refers to a file that stores a digital signal converted from an analog signal, i.e., the audio signal. If the analog signal were stored without conversion, it could be affected by noise, so the analog signal is converted into a digital signal before being stored. The created PCM file may be transmitted to a D/A converter. The D/A converter may convert the digital signal into an analog signal. The PCM file may be converted into an analog signal through the D/A converter, and the converted audio signal may finally be transmitted to the speaker 140 to be output to the user.
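As a concrete illustration of the PCM stage, the sketch below quantizes a float waveform to 16-bit PCM and writes it with Python's standard wave module; the mono layout and sample rate are assumed values.

    import wave
    import numpy as np

    SAMPLE_RATE = 48_000  # Hz (assumed)

    def write_pcm(path: str, samples: np.ndarray) -> None:
        """Quantize float samples in [-1.0, 1.0] to 16-bit PCM and store them."""
        pcm = (np.clip(samples, -1.0, 1.0) * 32767).astype(np.int16)
        with wave.open(path, "wb") as f:
            f.setnchannels(1)           # mono (assumed)
            f.setsampwidth(2)           # 16-bit samples
            f.setframerate(SAMPLE_RATE)
            f.writeframes(pcm.tobytes())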

The coder 121 may store the recorded audio signal as a compressed file using a codec in order to reduce the storage space occupied by the audio signal that has been converted into a digital signal. The coder 121 may receive the angle information corresponding to the speaker from the speaker position detecting unit 111, and may store it together with the corresponding recorded audio signal.

The decoder 123 may decompress the file compressed through the coder 121. The user angle selecting unit 127 may recognize the angle selection of the user. The user angle selecting unit 127 may recognize the speaker selection of the user as well as the angle selection. If the user wishes to hear the audio signal of the speaker “B,” or the audio signal of 90° that is mapped with the speaker “B,” the user angle selecting unit 127 may select the speaker “B,” or 90°. The user may make the selection in a list or through a specific user interface (UI).

The microphone unit 130 may include a plurality of microphones. One or more microphones may receive the audio signals. The received audio signal may be recorded by the controller 110, and may be used in calculating the position of the speaker.

The speaker 140 may reproduce the audio signal received through at least one microphone. The audio signal may be reproduced by the instruction of the controller 110 according to the user's selection.

A touch screen 150 may receive the angle information from the user angle selecting unit 127 of the controller 110, and may display the same. Here, the angle information is stored as a file in the memory 160 together with the audio signal corresponding thereto. The touch screen 150 may detect the user's selection of one or more of the displayed angles, and may transfer the selected angle to the user angle selecting unit 127.

In addition, the touch screen 150 may receive a recorded audio signal list from the controller 110. The touch screen 150 may display the received recorded audio signal list. The touch screen 150 may receive text which is generated based on the audio signal associated with a specific speaker. The text may be generated by the controller 110 using a speech-to-text (STT) function. The recorded audio signal list may permit the user to identify the content of each audio signal.

The memory 160 may include at least one of an internal memory or an external memory. The internal memory, for example, may include at least one of a volatile memory {e.g., a DRAM (dynamic random access memory), an SRAM (static random access memory), an SDRAM (synchronous dynamic random access memory), or the like}, a non-volatile memory {e.g., an OTPROM (one time programmable read-only memory), a PROM (programmable read-only memory), an EPROM (erasable and programmable read-only memory), an EEPROM (electrically erasable and programmable read-only memory), a mask read-only memory, a flash read-only memory, or the like}, an HDD (hard disk drive), or a solid-state drive (SSD). The external memory may include at least one of a CF (compact flash), SD (secure digital), Micro-SD (micro secure digital), Mini-SD (mini secure digital), xD (extreme digital), a memory stick, a network-accessible storage (NAS), a cloud storage, or the like. The memory 160 may store the audio file compressed by the coder 121.

The communication unit 180 may connect the electronic device 100 with external electronic devices. For example, the communication unit 180 may be connected to a network through wireless or wired communication to thereby communicate with the external electronic devices. The wireless communication may include Wi-Fi, BT (Bluetooth), NFC (near field communication), or the like. In addition, the wireless communication may include at least one selected from among the cellular communication networks (e.g., LTE, LTE-A, CDMA, WCDMA, UMTS, WiBro, GSM, or the like). For example, the wired communication may include at least one of a USB (universal serial bus), an HDMI (high definition multimedia interface), RS-232 (recommended standard 232), or a POTS (plain old telephone service).

FIG. 2 is a flowchart of an example of a process, according to embodiments of the present disclosure. Referring to FIG. 2, the controller 110 may recognize a user's request to begin the audio recording. In operation 203, the controller 110 may identify a plurality of angles. For example, the plurality of angles may be the angles of audio signals to be received. In some implementations, the controller 110 may map each of the received audio signals to a different one of a plurality of angles at an interval of 90 degrees, i.e., at the angles of 0°, 90°, 180°, and 270°, to thereby store the same. For example, the controller 110 may receive the audio signals from four microphones to detect the position of the speaker using energy information, phase information, or correlation information between the microphones. In instances in which the controller 110 recognizes that the position of the speaker is 80°, the controller 110 may configure the position of the speaker as 90°, which is the closest of the predefined angles.
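A minimal sketch of this snapping step, assuming the four predefined angles above, follows; the helper name is hypothetical.

    PREDEFINED_ANGLES = (0, 90, 180, 270)

    def snap_to_nearest(angle: float) -> int:
        """Map a detected angle to the closest predefined angle,
        accounting for wrap-around (e.g., 350 maps to 0)."""
        def circular_distance(a: float, b: float) -> float:
            d = abs(a - b) % 360
            return min(d, 360 - d)
        return min(PREDEFINED_ANGLES, key=lambda p: circular_distance(angle, p))

    # snap_to_nearest(80) -> 90, matching the example above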

In operation 205, the controller 110 may receive a plurality of audio signals through a plurality of microphones of the microphone unit 130.

In operation 207, the controller 110 may extract the audio signal that has the highest level of energy from the plurality of audio signals received from the plurality of microphones to thereby detect the angle of the audio signal. In operation 207, the controller 110 may also map the detected angle to one of the plurality of angles identified in operation 203. For example, if the controller 110 determines that the audio signal having the highest level of energy is received at an angle of 160°, the controller 110 may map the audio signal to 180°, which is the closest of the predefined angles.

In operation 209, the controller 110 may determine whether any angles in the plurality identified in operation 203 have not yet been processed. For example, since the controller 110 configured in operation 203 that four audio signals are to be received at an interval of 90°, the controller 110, which has received one audio signal in operation 207, may determine that there are three audio signals that have not yet been detected. If it is determined that there are angles that have not yet been processed, the controller 110 may proceed to operation 211. In operation 211, the controller 110 may detect the angle of the audio signal that has the highest level of energy from among the remaining audio signals, other than the previously detected audio signal. For example, if the angle of the detected audio signal is 90°, the audio signal may be mapped to 90°.

The controller 110 may return to operation 209 after detecting the angle of the audio signal that has the highest energy level from among the remaining audio signals in operation 211.

The controller 110 may repeat the operations above, and if all of the configured angles are detected, that is, if no undetected angle remains, the controller 110 may terminate the operation.

FIG. 3 is a flowchart of an example of a process, according to various embodiments of the present disclosure. FIG. 4 is a diagram of an example of a system implementing the process of FIG. 3, according to various embodiments of the present disclosure.

The operation of FIG. 3 will be described in association with the signal flow of FIG. 4. In operation 301, the controller 110 may begin recording audio. For example, the controller 110 may recognize a user's request to begin the audio recording. Three microphones of the microphone unit 130 shown in FIG. 4 are used. Three A/D converters 410 may convert the audio signals received from the plurality of microphones into digital signals. The three A/D converters 410 may transfer the audio signals, which have been converted into digital signals, to the controller 110.

In operation 303, the controller 110 may detect the position of the speaker. That is, the controller 110 may recognize the angle corresponding to the audio signal when the audio signal is received. In operation 305, the controller 110 may select one of the three microphones. Here, the microphones may be omnidirectional microphones. In operation 307, the controller 110 may record the audio signal using the selected microphone. In operation 309, the PCM file creating unit 117 and the speaker position detecting unit 111 may receive the audio signal, which has been converted into a digital signal, from the A/D converter 410. The coder 121 of the controller 110 may encode the angle information, which is received from the speaker position detecting unit 111, into the PCM file containing the audio signal. In addition, the coder 121 of the controller 110 may also encode time information into the PCM file. The time information may include a period of time for recording the audio signal, or the start time and the end time of the recording. The coder 121 of the controller 110 may transfer the compressed audio file to the memory 160 to store the same therein.
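The disclosure does not fix a container layout for the angle and time information, but one illustrative representation is a small per-segment metadata record kept alongside the compressed audio; all field names below are assumptions.

    import json
    from dataclasses import dataclass, asdict

    @dataclass
    class RecordedSegment:
        angle_degrees: int     # incident direction mapped to a predefined angle
        start_seconds: float   # start of the segment within the recording
        end_seconds: float     # end of the segment within the recording
        audio_path: str        # compressed audio produced by the coder 121

    def save_metadata(path: str, segments: list[RecordedSegment]) -> None:
        """Store the per-segment angle/time information next to the audio."""
        with open(path, "w", encoding="utf-8") as f:
            json.dump([asdict(s) for s in segments], f, indent=2)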

FIG. 5 is a diagram of an example of a stored audio signal, according to various embodiments of the present disclosure.

FIG. 5 shows the file recorded as a result of executing the process of FIG. 3. The horizontal axis in FIG. 5 denotes time, in which the unit may be a second. In addition, the vertical axis denotes the magnitude of the audio signal, in which the unit may be a decibel (dB). FIG. 5 shows an example in which the audio signals corresponding to several angles are stored as a single file. It shows that the audio signals, and the angles at which the audio signals are received, are stored together. In addition, it shows that the recording time of each audio signal is stored as well. The recording time may be expressed as the length of the section for the audio signal of each speaker in the file.

Referring to the recorded file, the audio signal A (510 a) occurs at an angle of 0° (520 a). The audio signal B (510 b) occurs at an angle of 90° (520 b). The audio signal C (510 c) occurs at an angle of 180° (520 c). The audio signal D (510 d) occurs at an angle of 270° (520 d). Comparing the section of the audio signal A with the section of the audio signal B, the section of the audio signal A (510 a) is shorter than the section of the audio signal B (510 b). This means that the recording time for the audio signal A (510 a) is shorter than the recording time of the audio signal B (510 b).

FIG. 6 is a diagram of an example of a system for rendering audio, according to various embodiments of the present disclosure.

Referring to FIG. 6, the controller 110 may receive the compressed and stored audio file from the memory 160. The controller 110 may transfer the compressed audio file to the decoder 123. In addition, the controller 110 may transfer the angle information corresponding to the compressed audio file to the user angle selecting unit 127. The user angle selecting unit 127 may transfer the angle information to the touch screen 150. The touch screen 150 may display all angles identified by the angle information to allow the user to select at least one thereof. The touch screen 150 may transfer the angle selected by the user to the user angle selecting unit 127. The user angle selecting unit 127 may transfer the angle selected by the user to the PCM file creating unit 117. The PCM file creating unit 117 may transform only the audio signal corresponding to the selected angle into a PCM file, and may transfer the same to the D/A converter.

The D/A converter 610 may convert the PCM file into an analog signal and feed the analog signal to the speaker 140. The D/A converter 610 may transfer the converted audio signal to the speaker 140, and the speaker 140 may output the audio signal.

FIG. 7 is a diagram illustrating an example of a rendered audio signal, according to various embodiments of the present disclosure.

FIG. 7 shows the reproduced audio signal. The horizontal axis denotes the time, in which the unit may be a second. In addition, the vertical axis denotes the magnitude of the audio signal, in which the unit may be a decibel (dB). When the user wishes to listen to only the audio signal at an angle of 90° (520 b), the audio signal 510 b corresponding to the angle of 90° among all of the audio signals is reproduced. That is, the audio signals corresponding to angles other than 90° may not be reproduced. If the controller 110 recognizes the user's selection of the audio signal of 180°, the controller 110 may reproduce only the audio signal corresponding to the angle of 180° among all of the files.
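This selective reproduction can be sketched as filtering the stored segments by their angle tag before the PCM and D/A stages. The segment dictionaries below (with 'angle_degrees', 'start_seconds', and decoded 'pcm' arrays) are an assumed representation, not the disclosed format.

    import numpy as np

    def render_angle(segments: list[dict], selected_angle: int) -> np.ndarray:
        """Concatenate only the segments whose stored angle matches the
        user's selection; all other directions are skipped."""
        picked = [s for s in segments if s["angle_degrees"] == selected_angle]
        if not picked:
            return np.zeros(0, dtype=np.int16)
        picked.sort(key=lambda s: s["start_seconds"])  # keep the time order
        return np.concatenate([s["pcm"] for s in picked])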

FIG. 8 is a flowchart of an example of a process, according to various embodiments of the present disclosure. FIG. 9 is a diagram of an example of a system implementing the process of FIG. 8, according to various embodiments of the present disclosure.

The operation of FIG. 8 will be described in association with the signal flow of FIG. 9. In operation 801, the controller 110 may perform the audio recording. The controller 110 may recognize a user's request to begin the audio recording. As shown in FIG. 9, three microphones may be used by the controller 110 to receive audio signals. Three A/D converters 910 may convert the audio signals received from the plurality of microphones into digital signals. The three A/D converters 910 may transfer the audio signals, which have been converted into digital signals, to the controller 110.

In operation 803, the controller 110 may detect the position of the speaker. For example, the controller 110 may recognize the angle corresponding to a received audio signal. As shown in FIG. 9, the audio signals received by the microphones are converted into digital signals through the A/D converters 910 to be then transferred to the speaker position detecting unit 111. The speaker position detecting unit 111 may recognize the angles corresponding to the received audio signals, and may transfer information corresponding to the angles to the beamformer 113.

In operation 805, the beamformer 113 of the controller 110 may form a beam at the detected angle of the speaker. In instances in which several audio signals are received at different angles through the microphones, the beamformer 113 may form a beam at the angle of the audio signal that has the highest energy level. In operation 807, the controller 110 may store the audio signal recorded by forming the beam, together with angle information and time information corresponding thereto.

In operation 809, the controller 110 may determine whether or not the position of the speaker has changed. The speaker position detecting unit 111 may recognize the angle of a received audio signal to thereby determine whether the position of the speaker has changed. If the speaker position detecting unit 111 of the controller 110 determines that the angle of the received audio signal, i.e., the angle of the speaker, has changed, the controller may return to operation 803. If the speaker position detecting unit 111 of the controller 110 determines that the angle of the speaker has not changed, the controller may return to operation 805.
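A compact sketch of this tracking loop follows, with hypothetical callables standing in for the detector, beamformer, and coder of FIG. 9.

    def record_with_tracking(frames, detect_angle, form_beam, store):
        """Re-steer the beam whenever the detected speaker angle changes
        (operations 803-809); otherwise keep recording with the current beam."""
        current_angle = None
        for frame in frames:                 # frame: one block of multi-mic samples
            angle = detect_angle(frame)      # speaker position detecting unit 111
            if angle != current_angle:       # position changed: form a new beam
                current_angle = angle
            beamformed = form_beam(frame, current_angle)  # beamformer 113
            store(beamformed, current_angle) # coder 121 adds angle/time info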

As shown in FIG. 9, the beamformer 113 of the controller 110 may transfer the audio signal, which is obtained by implementing the beam, to the PCM file creating unit 117. The PCM file creating unit 117 of the controller 110 may create a PCM file from the audio signal received from the beamformer 113 and transfer the same to the coder 121. In operation 809, the coder 121 may compress the PCM file and the angle information received from the speaker position detecting unit 111 to create an audio file. In addition, the coder 121 of the controller 110 may compress the time information of the received audio signal into the audio file as well. The coder 121 may store the compressed audio file in the memory 160.

FIG. 10 is a diagram of an example of a stored audio signal, according to various embodiments of the present disclosure. FIG. 10 shows the file recorded through the operation of FIG. 8. The horizontal axis denotes the time, in which the unit may be a second. In addition, the vertical axis denotes the magnitude of the audio signal, in which the unit may be a decibel (dB). FIG. 10 shows an example in which the audio signals corresponding to several angles are stored as a single file. In this example, the audio signals, which are received through the beamforming, and the angles, at which the audio signals are received, are stored together. In addition, the recording time of each audio signal may be stored in the file as well. The recording time may be expressed as the length of the section for the audio signal of each speaker in the file.

Referring to the recorded file, the audio signal A (1010 a) occurs at an angle of 0° (1020 a). The audio signal B (1010 b) occurs at an angle of 90° (1020 b). The audio signal C (1010 c) occurs at an angle of 180° (1020 c). The audio signal D (1010 d) occurs at an angle of 270° (1020 d). Comparing the section of the audio signal A (1010 a) with the section of the audio signal B (1010 b), the section of the audio signal A (1010 a) is shorter than the section of the audio signal B (1010 b). This means that the recording time for the audio signal A (1010 a) is shorter than the recording time of the audio signal B (1010 b).

FIG. 11 is a diagram of a system for rendering recorded audio signals, according to various embodiments of the present disclosure.

Referring to FIG. 11, the user angle selecting unit 127 of the controller 110 may receive angle information corresponding to each audio signal from the memory 160. The decoder 123 of the controller 110 may receive the compressed audio file from the memory 160, and may decompress the same. The PCM file creating unit 117 of the controller 110 may receive the audio signal from the decoder 123, and may transform the same into a PCM file. The audio signal transformed by the PCM file creating unit 117 may be transferred to the D/A converter 1110 so that, with the angle information received from the user angle selecting unit 127, only the audio signal corresponding to the selected angle is reproduced.

The D/A converter 1110 may convert the PCM file of a digital signal into an analog signal and feed the analog signal to the speaker 140. The D/A converter 1110 may transfer the converted audio signal to the speaker 140, and the speaker 140 may output the audio signal.

FIG. 12 is a flowchart of an example of a process, according to various embodiments of the present disclosure. FIG. 13 is a diagram of an example of a system for implementing the process of FIG. 12, according to various embodiments of the present disclosure.

The operation of FIG. 12 will be described in association with the signal flow of FIG. 13. In operation 1201, the controller 110 may begin recording audio. For example, the controller 110 may recognize a user's request to begin the audio recording. As shown in FIG. 13, three microphones are used by the controller 110 to receive the audio signals. A plurality of A/D converters 1310 may convert the audio signals received from the three microphones into digital signals. The three A/D converters 1310 may transfer the audio signals, which have been converted into digital signals, to the controller 110.

In operation 1203, the controller 110 may detect the positions of a plurality of speakers. That is, when a plurality of audio signals is received, the controller 110 may recognize the angles corresponding to the audio signals. As shown in FIG. 13, the audio signals received by the three microphones are converted into digital signals by the A/D converters 1310 to be then transferred to the speaker position detecting unit 111. The speaker position detecting unit 111 may recognize the angles corresponding to the received audio signals, and may transfer an indication of each angle to the beamformers 113 a to 113 c.

In operation 1205, the beamformers 113 a to 113 c of the controller 110 may form beams at all of the detected angles, respectively. In addition, the beamformers 113 a to 113 c of the controller 110 may form the beams only at the angles of the audio signals that have energy levels greater than a predetermined value. As shown in FIG. 13, the beamformers 113 a to 113 c of the controller 110 may transfer the audio signals, which are obtained by implementing the beams, to the PCM file creating units 117 a to 117 c. The PCM file creating units 117 a to 117 c of the controller 110 may transform the audio signals received from the beamformers 113 a to 113 c into PCM files and transfer the same to the coder 121. In operation 1207, the coder 121 may create audio files by associating the PCM files with a plurality of pieces of the angle information received from the speaker position detecting unit 111, and thereby compress the same. In addition, the coder 121 of the controller 110 may compress the time information of the received audio signals into the audio files as well. The coder 121 may store the compressed audio files in the memory 160.
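The parallel structure of FIG. 13 can be sketched as one beam per sufficiently strong angle, each direction collecting its own output (later compressed into its own file). The threshold value and helper callables are assumptions.

    ENERGY_THRESHOLD = 0.01  # assumed gate for forming a beam at an angle

    def record_multi_beam(frames, detect_angles, form_beam):
        """Form a beam for every detected angle above the energy gate and
        collect each direction's samples separately (File 1, File 2, ...)."""
        outputs: dict[int, list] = {}        # angle -> beamformed sample blocks
        for frame in frames:
            for angle, energy in detect_angles(frame):
                if energy < ENERGY_THRESHOLD:
                    continue                 # too weak: no beam at this angle
                outputs.setdefault(angle, []).append(form_beam(frame, angle))
        return outputs                       # one entry per per-angle file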

FIG. 14 is a diagram of a stored audio signal, according to various embodiments of the present disclosure.

FIG. 14 shows the files recorded through the operation of FIG. 12. The horizontal axis denotes the time, in which the unit may be a second. In addition, the vertical axis denotes the magnitude of the audio signal, in which the unit may be a decibel (dB). FIG. 14 shows an example in which the audio signals corresponding to the angles are stored as respective files. In addition, it is assumed that the audio signals of the files are recorded in order of time in FIG. 14. In the example of FIG. 14, the audio signals received through the beamforming, and the angles, at which the audio signals are received, may be stored together. In addition, it shows that the recording time of each audio signal is stored as well. The recording time may be expressed as the length of the section for the audio signal of each speaker in the file.

Referring to the recorded files, the audio signal A (1410 a) stored in File 1 occurs at an angle of 0° (1420 a). The audio signal B (1410 b) stored in File 2 occurs at an angle of 90° (1420 b). The audio signal C (1410 c) stored in File 3 occurs at an angle of 180° (1420 c). The audio signal D (1410 d) stored in File 4 occurs at an angle of 270° (1420 d).

In addition, although it is not shown in the drawing, the respective representations of all audio signals may be encapsulated in the same file. For example, when another audio signal occurs at the angle of 0° (1420 a), that audio signal may be stored after the audio signal 1410 a in File 1. If another audio signal additionally occurs after the audio signal 1410 d is stored, the additionally created audio signal may be stored after the audio signal 1410 d in File 1. In addition, if another audio signal additionally occurs in the middle of storing the audio signal 1410 c, the additionally created audio signal may be stored at the same time as the audio signal 1410 c of the speaker C in File 1.

FIG. 15 is a diagram of an example of a system for rendering a stored audio signal, according to various embodiments of the present disclosure.

Referring to FIG. 15, the user angle selecting unit 127 of the controller 110 may receive the position information, i.e., the angle information corresponding to the speaker, from the memory 160. The user angle selecting unit 127 may transfer the received angle information to the touch screen 150, and the touch screen 150 may display the angles corresponding to the received angle information. The user angle selecting unit 127 may recognize the angle selected by the user on the touch screen 150. The user angle selecting unit 127 may transfer the selected angle to the decoder 123, and the decoder 123 may receive only the file corresponding to the selected angle from the memory 160. The decoder 123 may decompress the received file, and may perform the buffer and mixing process 1570 with respect to the file corresponding to the angle selected by the user angle selecting unit 127. The controller 110 may transfer the processed file to the PCM file creating unit 117, and the PCM file creating unit 117 may transform the transferred file into a PCM file. The file created by the PCM file creating unit 117 may be transferred to the D/A converter 1510. The D/A converter 1510 may convert the PCM file of the digital signal into an analog signal and feed the analog signal to the speaker 140. The D/A converter 1510 may transfer the converted audio signal to the speaker 140, and the speaker 140 may output the audio signal.
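The buffer and mixing process 1570 can be illustrated by summing the decoded per-angle signals that the user selected, after padding to a common length. The averaging rule is an assumed, naive mixing choice, not the disclosed one.

    import numpy as np

    def mix_selected(decoded: dict[int, np.ndarray],
                     selected_angles: list[int]) -> np.ndarray:
        """Mix the decoded signals of the user-selected angles into one
        stream for PCM conversion and D/A output."""
        chosen = [decoded[a].astype(np.float64) for a in selected_angles]
        if not chosen:
            return np.zeros(0, dtype=np.int16)
        mixed = np.zeros(max(len(c) for c in chosen))
        for c in chosen:
            mixed[: len(c)] += c             # shorter signals padded with silence
        mixed /= len(chosen)                 # naive averaging to avoid clipping
        return mixed.astype(np.int16)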

FIG. 16 is a diagram illustrating an example of a process for recording audio, according to various embodiments of the present disclosure. Three microphones may be arranged in different directions from each other. One or more beams may be formed through a combination of the three microphones.

As shown in the drawing, three microphones 1641, 1642, and 1643 are disposed in different directions from each other, and four beams 1611, 1612, 1613, and 1614 may be formed through a combination of the three microphones 1641, 1642, and 1643. Each of the beams 1611, 1612, 1613, and 1614 may receive the audio signal only at its formed angle. The received audio signals may be stored together with angle information corresponding thereto.

FIG. 17 is a diagram of an example of a user interface for rendering audio, according to various embodiments of the present disclosure.

Referring to FIG. 17, the controller 110 may display a UI on the touch screen 150, which allows the user to reproduce an audio signal that is associated with a desired direction. In an embodiment, the UI may include identifiers, which indicate the locations of the speakers relative to a microphone array used to record the sound produced by the speakers. The identifiers may be displayed on the circle to correspond to the angles of the speakers. As shown in the drawing, an identifier A (1701 a), an identifier B (1701 b), an identifier C (1701 c), and an identifier D (1701 d) are displayed at the positions corresponding to 0°, 90°, 180°, and 270°, which may be approximate locations of the speakers relative to the microphone array.

If the user selects at least one of the identifiers, the controller 110 may reproduce the audio file associated with the angle corresponding to the identifier. In addition, if the user selects the all-play button 1750, the controller 110 may reproduce all of the audio files through the speaker. All of the audio files may be the files that include the audio signals at all angles.

FIG. 18 is a diagram of an example of a user interface for rendering audio, according to various embodiments of the present disclosure.

Referring to FIG. 18, the controller 110 may display a list that allows the user to select an audio signal that is associated with a desired direction. The list may include an identifier that indicates the speaker, a play button 1850, a stop button 1860, and a recording time 1870. If the user selects one of the identifiers 1801 a to 1801 d, the controller 110 may reproduce the stored audio file corresponding to the selected identifier through the speaker 140. For example, when the user selects the play button 1850 in order to listen to the audio signal of the identifier A (1801 a), the controller 110 may reproduce the stored audio file associated with the identifier 1801 a for 3 min 40 sec.

In addition, when one of the identifiers is selected by the user, the controller 110 may provide section information corresponding to the selected identifier. The section information may be the information indicating the start time and the end time, within the entire recording time, of the recorded audio signal of the speaker corresponding to the selected identifier. The controller 110 may express the section information as images or numbers.

For example, when the user selects the identifier A (1801 a), the controller 110 may provide the section information corresponding to the selected identifier A (1801 a). The section information of the identifier A (1801 a) may be the information stating that the audio signal is recorded from the time of 3 min to the time of 6 min 40 sec of the whole recording time of 27 min 35 sec. The controller 110 may provide the section information when the user selects the identifier A (1801 a), or may display the section information in the list or in the reproduced image when the recording time is selected or while the audio file is reproduced.
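For illustration, such section information could be computed and rendered as numbers as follows; the time format is an assumption.

    def format_section(start_s: int, end_s: int, total_s: int) -> str:
        """Render a speaker's recorded section within the whole recording,
        e.g. '3:00-6:40 of 27:35' for the identifier A example above."""
        def mmss(t: int) -> str:
            return f"{t // 60}:{t % 60:02d}"
        return f"{mmss(start_s)}-{mmss(end_s)} of {mmss(total_s)}"

    # format_section(180, 400, 1655) -> '3:00-6:40 of 27:35'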

FIG. 19 is a diagram of an example of a user interface for rendering audio, according to various embodiments of the present disclosure.

The controller 110 may identify the speakers of the recorded audio signals, in addition to distinguishing the audio signals according to their angles. To this end, the controller 110 may pre-store speaker recognition information using a sound-shot function before performing the audio recording. The speaker recognition information may include the waves of the audio signals and photos of the speakers. The sound-shot function refers to the function of storing the audio signal recorded when taking a photo, together with the photo.

For example, if the user photographs the face of the speaker A (1900 a) and records the audio signal 1910 a of the speaker using the sound-shot function, the controller 110 may map the photo with the audio signal to thereby store the same as a single audio file 1901 a in the memory 160. As shown in FIG. 19, the photos of the speaker A (1900 a), the speaker B (1900 b), the speaker C (1900 c), and the speaker D (1900 d) may be stored together with the audio signal wave 1910 a of the speaker A (1900 a), the audio signal wave 1910 b of the speaker B (1900 b), the audio signal wave 1910 c of the speaker C (1900 c), and the audio signal wave 1910 d of the speaker D (1900 d) as files 1901 a to 1901 d, respectively. The audio signal waves may be distinct from each other depending on the features of the human voice, so the audio signal wave may be used to identify the speakers.

In another embodiment, in order to recognize the speakers, the user may pre-store the voices of the speakers as the speaker recognition information before the recording of the audio signals. Accordingly, the controller 110 may record the voices of the speakers to be stored in the memory 160, and may use the same for comparison later. Additionally or alternatively, when storing the voices of the speakers, the user may also store the names of the speakers, and/or other information that can be used to indicate the speakers' identities.

In another embodiment, during a phone call with those who are stored in the contact information, the controller 110 may store the voices of the speakers in the memory 160 to use the same as the speaker recognition information.

FIG. 20 is a diagram of an example of a process for recording audio, according to various embodiments of the present disclosure.

As mentioned with reference to FIG. 19, the controller 110 may take photos of the speakers, and may pre-store the photos and the audio signals in the memory 160 using the sound-shot function, in order to identify the speaker of the audio signal recorded at each angle. Referring to FIG. 20, the controller 110 may compare the waves of the audio signals stored at the angles with the audio signal waves of the sound-shot files stored in the memory 160. If the controller 110 finds a sound-shot file that has an audio signal wave matching the wave of the audio signal stored at a given angle, the controller 110 may map the photo of the sound-shot file with the audio signal stored at that angle to thereby store the same. For example, as shown in FIG. 20, the beams 2011 to 2014 may be formed to receive the audio signals of the speaker A (2001 a), the speaker B (2001 b), the speaker C (2001 c), and the speaker D (2001 d), respectively. The memory 160 may hold the photos and the audio signals of the speakers 2001 a to 2001 d. The controller 110 may compare the received audio signal waves of the speakers with the audio signal waves stored in the memory 160, and may thereby map each received signal to the matching stored signal and store the result.
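One simple stand-in for this wave comparison is normalized cross-correlation between a recorded signal and each stored template; practical speaker recognition would use richer voice features, so the sketch below is an assumed simplification.

    import numpy as np

    def best_matching_speaker(segment: np.ndarray,
                              templates: dict[str, np.ndarray]) -> str:
        """Return the stored speaker whose voice template correlates most
        strongly with the recorded segment."""
        def score(a: np.ndarray, b: np.ndarray) -> float:
            n = min(len(a), len(b))
            a = a[:n].astype(np.float64) - a[:n].mean()
            b = b[:n].astype(np.float64) - b[:n].mean()
            denom = np.linalg.norm(a) * np.linalg.norm(b)
            return float(np.dot(a, b) / denom) if denom else 0.0
        return max(templates, key=lambda name: score(segment, templates[name]))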

In another embodiment, the controller 110 may compare the received audio signal waves of the speakers with the audio signal waves that have been pre-recorded and pre-stored for the comparison. The controller 110 may compare the received audio signal waves of the speakers with the audio signal waves stored in the memory 160 to determine the respective identities of the speakers.

In another embodiment, the controller 110 may compare the received audio signal waves of the speakers with the audio signal waves of the users who are represented in the contact information. The controller 110 may compare the received audio signal waves of the speakers with the audio signal waves stored in the memory 160 to determine the identities of the speakers.

Referring to the files recorded according to the various embodiments above, the audio signal A (2010 a) stored in File 1 occurs at an angle of 0° (2020 a) by the speaker A (2001 a). The audio signal B (2010 b) stored in File 2 occurs at an angle of 90° (2020 b) by the speaker B (2001 b). The audio signal C (2010 c) stored in File 3 occurs at an angle of 180° (2020 c) by the speaker C (2001 c). The audio signal D (2010 d) stored in File 4 occurs at an angle of 270° (2020 d) by the speaker D (2001 d).

FIG. 21 is a diagram of an example of a user interface for rendering audio, according to various embodiments of the present disclosure.

As mentioned with reference to FIG. 20, the audio files may be stored per speaker through the speaker recognition. The controller 110 may create documents from the files stored according to the speakers, using a speech-to-text (STT) function.

As shown in FIG. 21, the controller 110 may create the minutes 2100 as one of the documents. The minutes 2100 may include identifiers 2101 or photos of the speakers for identifying the speakers, STT-transformed text 2103, the recording time of the audio file 2105, and play buttons 2107 for reproducing the audio file. For example, the controller 110 may transform the audio file of the speaker A (2101 a), which is recorded first, into text, and may record the same in the minutes 2100 ordered by time. The controller 110 may include the play button 2107 for reproducing the audio file corresponding to the recording time 2105 of “00:00:00˜00:00:34” in the minutes 2100.
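Assembling such minutes from per-speaker STT output can be sketched as below; the entry tuple layout is an assumption standing in for the STT results described above.

    def build_minutes(entries: list[tuple[str, str, str, str]]) -> str:
        """Build minutes text from (speaker, start, end, text) entries,
        ordered by start time, in the spirit of FIG. 21."""
        lines = []
        for speaker, start, end, text in sorted(entries, key=lambda e: e[1]):
            lines.append(f"[{start}~{end}] {speaker}: {text}")
        return "\n".join(lines)

    # build_minutes([("Speaker A", "00:00:00", "00:00:34", "Opening remarks")])
    # -> '[00:00:00~00:00:34] Speaker A: Opening remarks'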

FIGS. 1-21 are provided as an example only. At least some of the steps discussed with respect to these figures can be performed concurrently, performed in a different order, and/or altogether omitted. It will be understood that the provision of the examples described herein, as well as clauses phrased as “such as,” “e.g.,” “including,” “in some aspects,” “in some implementations,” and the like should not be interpreted as limiting the claimed subject matter to the specific examples.

The above-described aspects of the present disclosure can be implemented in hardware or firmware, or via the execution of software or computer code that can be stored in a recording medium such as a CD-ROM, a Digital Versatile Disc (DVD), a magnetic tape, a RAM, a floppy disk, a hard disk, or a magneto-optical disk, or via computer code downloaded over a network, originally stored on a remote recording medium or a non-transitory machine-readable medium, and to be stored on a local recording medium, so that the methods described herein can be rendered via such software that is stored on the recording medium using a general purpose computer, a special processor, or programmable or dedicated hardware, such as an ASIC or FPGA. As would be understood in the art, the computer, the processor, the microprocessor controller, or the programmable hardware include memory components, e.g., RAM, ROM, Flash, etc., that may store or receive software or computer code that, when accessed and executed by the computer, processor or hardware, implements the processing methods described herein. In addition, it would be recognized that when a general purpose computer accesses code for implementing the processing shown herein, the execution of the code transforms the general purpose computer into a special purpose computer for executing the processing shown herein. Any of the functions and steps provided in the Figures may be implemented in hardware, software or a combination of both and may be performed in whole or in part within the programmed instructions of a computer. No claim element herein is to be construed under the provisions of 35 U.S.C. 112, sixth paragraph, unless the element is expressly recited using the phrase “means for”.

While the present disclosure has been particularly shown and described with reference to the examples provided therein, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the present disclosure as defined by the appended claims.

What is claimed is:
 1. A mobile communication device comprising: a display; memory to store audio data, the audio data including first and second sound signals received from an outside of the mobile communication device, the first sound signal corresponding to a first direction with respect to the mobile communication device, the second sound signal corresponding to a second direction with respect to the mobile communication device, the first sound signal to be output in a first section of the audio data, the second sound signal to be output in a second section of the audio data; and a processor adapted to: reproduce the audio data; present, via a first graphic user interface, a first indication at a first position associated with the first direction, and a second indication at a second position associated with the second direction, in relation with the reproducing; and present, via a second graphic user interface, a third indication indicative of the first section and a fourth indication indicative of the second section, in relation with the reproducing.
 2. The mobile communication device of claim 1, wherein the processor is adapted to: adjust a volume of a corresponding one of the first sound signal and the second sound signal, based at least in part on another user input from the first graphical user interface.
 3. The mobile communication device of claim 2, wherein the processor is adapted to: mute the volume, as at least part of the adjusting.
 4. The mobile communication device of claim 2, wherein the processor is adapted to: deactivate a corresponding one of the first section and the second section, based at least in part on the user input, as at least part of the muting.
 5. The mobile communication device of claim 2, wherein the processor is adapted to: change, via the display, at least one visual characteristic, a text, a number, or an image of a corresponding one of the first indication and the second indication, based at least in part on the user input.
 6. The mobile communication device of claim 2, wherein the first sound signal corresponds to the first direction with respect to the mobile communication device and the second sound signal corresponds to the second direction with respect to the mobile communication device, wherein the processor is adapted to: display a graphic user interface including a substantially circular element in relation with the reproducing, the graphic user interface including a first element having a first angle indicative of the first direction and a second element having a second angle indicative of the second direction.